---
title: Highlighting
noindex: true
---

<Danger>
  **Legacy Docs:** This page describes our legacy API. It will be deprecated in
  a future version. Please use the [v2 API](/) where possible.
</Danger>

<Note>
  Highlighting is an expensive process and can slow down query times. We
  recommend passing a `LIMIT` to any query where `pdb.snippet` is called to
  restrict the number of snippets that need to be generated.
</Note>

<Note>
  Highlighting is not supported for queries that use fuzziness, like
  `paradedb.fuzzy_term`.
</Note>

Highlighting refers to the practice of visually emphasizing the portions of a document that match a user's
search query.

## Basic Usage

`pdb.snippet(<column>)` can be added to any query where an `@@@` operator is present.
The following query generates highlighted snippets against the `description` field.

```sql
SELECT id, pdb.snippet(description)
FROM mock_items
WHERE description @@@ 'shoes'
LIMIT 5;
```

By default, each matched term in the snippet is wrapped in `<b></b>` tags. The tags can be changed with `start_tag` and `end_tag`:

```sql
SELECT id, pdb.snippet(description, start_tag => '<i>', end_tag => '</i>')
FROM mock_items
WHERE description @@@ 'shoes'
LIMIT 5;
```

## Fragment Size

For every highlighted term, a fragment of size `max_num_chars` is created containing the term and its surrounding text. A fragment can contain
multiple highlighted terms if they are within `max_num_chars` distance of one another. By default, `max_num_chars` is set to `150`.

```sql
SELECT id, pdb.snippet(description, max_num_chars => 100)
FROM mock_items
WHERE description @@@ 'shoes'
LIMIT 5;
```

If multiple fragments are found, `pdb.snippet` uses a two-tiered scoring system to determine which fragment to display:

1. Each highlighted term receives a score based on its inverse document frequency. This means that fragments containing rarer terms will score higher.
2. If there is a tie, the fragment that appears earlier in the source text will be displayed.
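
To observe fragment selection, one option is to search for more than one term with a deliberately small `max_num_chars`, so that matches far apart in a document are more likely to fall into separate fragments. The sketch below assumes the same `mock_items` table used above. The value `20` and the choice of terms are illustrative only, and which fragment is returned depends on the indexed data.

```sql
-- Illustrative sketch: a small fragment size makes it more likely that
-- matches for 'shoes' and 'running' land in different fragments, so the
-- scoring rules above decide which fragment is displayed.
SELECT id, pdb.snippet(description, max_num_chars => 20)
FROM mock_items
WHERE description @@@ 'shoes' OR description @@@ 'running'
LIMIT 5;
```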

## Byte Offsets

`pdb.snippet_positions(<column>)` returns the byte offsets in the original text where the snippets would appear. It returns an array of
tuples, where the first element of each tuple is the byte index of the first byte of the highlighted region, and the second element is the byte index immediately after the last byte of the region.

```sql
SELECT id, pdb.snippet(description), pdb.snippet_positions(description)
FROM mock_items
WHERE description @@@ 'shoes'
LIMIT 5;
```

```ini Expected Response
 id |          snippet           | snippet_positions
----+----------------------------+-------------------
  3 | Sleek running <b>shoes</b> | {"{14,19}"}
  4 | White jogging <b>shoes</b> | {"{14,19}"}
  5 | Generic <b>shoes</b>       | {"{8,13}"}
(3 rows)
```
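
Because the positions are byte offsets rather than character offsets, slicing the original text by character index can give the wrong result when the text contains multi-byte characters. As a minimal sketch using only built-in Postgres functions, the query below recovers the highlighted text for the first row above from its offsets `{14,19}`. The literal values are copied from the expected response. How the array returned by `pdb.snippet_positions` is unpacked in your own query is not shown here.

```sql
-- The offsets are zero-based with an exclusive end, while Postgres
-- substring() is one-based, so start at 14 + 1 and take 19 - 14 bytes.
SELECT convert_from(
  substring(convert_to('Sleek running shoes', 'UTF8') FROM 14 + 1 FOR 19 - 14),
  'UTF8'
) AS highlighted;
```

For this input, the query returns `shoes`.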

## Snippet Limit and Offset

Both `pdb.snippet` and `pdb.snippet_positions` accept `limit` and `offset` arguments. A `limit` restricts the number of
highlighted terms, while an `offset` ignores the first `offset` highlighted terms. This can be useful for paginating
through documents that contain large numbers of highlighted terms.

```sql
SELECT id, pdb.snippet(description, "limit" => 1, "offset" => 1)
FROM mock_items
WHERE description @@@ 'shoes' AND description @@@ 'sleek' AND description @@@ 'running';
```

```ini Expected Response
 id |          snippet
----+----------------------------
  3 | Sleek <b>running</b> shoes
(1 row)
```

<Note>
  The `limit` and `offset` arguments must be wrapped in double quotes because
  they are reserved keywords in Postgres.
</Note>

In the output above, notice that `sleek` is not highlighted because an offset of `1` skips the first highlighted term.
Similarly, `shoes` is not highlighted because a limit of `1` allows only one term, `running`, to be highlighted.
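
The same arguments apply to `pdb.snippet_positions`. As a sketch, the query below requests the byte offsets of a single highlighted term, skipping the first one. Repeating the query with increasing `offset` values is one way to step through the matches of a long document one term at a time.

```sql
-- Sketch: fetch the byte offsets of one highlighted term per call.
-- Increase "offset" (0, 1, 2, ...) to step through the remaining terms.
SELECT id, pdb.snippet_positions(description, "limit" => 1, "offset" => 1)
FROM mock_items
WHERE description @@@ 'shoes' AND description @@@ 'sleek' AND description @@@ 'running';
```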
